The Indexing Properties Of An Ancillary Statistic
Anthony  Y.C.  Kuk 

1.  INTRODUCTION 


Approximate  ancillarity  has  recently  been  the  subject  of  substantial 
work (Efron & Hinkley 1978; Cox 1980; Hinkley 1980; Barndorff-Nielsen
1980;  Amari  1982b).  While  this  topic  is  undoubtedly  important,  we  believe 
that  the  concept  of  exact  ancillarity  has  not  yet  been  fully  explored. 

The  usual  presentation  consists  mainly  of  examples  of  ancillary  statistics, 
on  the  basis  of  which  certain  statements  are  made.  These  statements, 
though  often  appealing,  are  hard  to  make  precise.  The  purpose  of  our 
study  is  to  clarify  some  of  the  properties  of  an  ancillary  statistic. 


In §2, we reexamine some of the examples and statements in a way
relevant to our subsequent discussion. In §3, we introduce transformation
models and point out that most of the known examples of ancillary
statistics fall within this category. We also give an interpretation
of the statement: "Two samples with the same ancillary statistic value
contain equal amounts of information." In §4, we discuss the role of
an ancillary statistic as a precision index. For transformation models,
and also for exponential models, for which ancillary statistics also
exist, we find that the variance of the conditional Fisher information
is proportional to the square of the statistical curvature $\gamma_\theta$ (Efron,
1975). Thus the magnitude of $\gamma_\theta$ is a measure of the effect of
conditioning. Moreover, the constancy of $\gamma_\theta$ as a function of $\theta$ appears
related to the concept of exact precision index (Buehler, 1982). We
conclude in §5 with some miscellaneous remarks.


2.  EXAMPLES 


The standard examples used to introduce the concept of ancillary
statistics and the conditionality principle are those of random sample
size, of two measuring instruments and the mixture problem (Cox & Hinkley
1974, pp. 32, 38). These examples are intended to suggest that the observed
value  of  the  ancillary  statistic  describes  the  part  of  the  total  sample 
space  relevant  to  the  problem  at  hand,  and  that  inference  about  the 
parameter  should  be  conditioned  on  that  value.  In  the  standard  examples, 
the  fact  that  some  other  sample  size,  some  other  instrument  or  some  other 
distribution  might  have  been  used,  but  actually  was  not,  is  irrelevant. 
While  these  examples  may  seem  artificial,  similar  situations  arise  more 
subtly  in  other  problems  of  statistical  inference. 

Example 1. Fisher's normal circle. Efron (1978), Efron & Hinkley
(1978). Let $X_1, \ldots, X_n$ be independent observations from a bivariate
normal distribution with mean vector $(\rho\cos\theta, \rho\sin\theta)$ lying on a circle of
given radius $\rho$ and with identity covariance matrix; then $X = \sum X_i$ is
sufficient. If $X$ has polar coordinates $(\hat\theta, r)$, then $\hat\theta$ is the maximum
likelihood estimate of $\theta$ and $r$ is ancillary. It can be shown that
$E_\theta\{(\hat\theta - \theta)^2\}$ and $E_\theta\{(\hat\theta - \theta)^2 \mid r\}$ are both independent of $\theta$ and that the
latter is a decreasing function of $r$, so that the accuracy of $\hat\theta$ improves
as $r$ increases.
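The dependence of accuracy on $r$ in the normal circle can be checked by simulation. The following is only a rough sketch under stated assumptions: the function names, the choice $\rho = 1$, $n = 5$, the number of replications and the grouping of $r$ into quartiles are all illustrative and not taken from the paper. It simulates the sum $X = \sum X_i$, records its polar coordinates, and averages the squared angular error within quartile groups of $r$.

```python
import math
import random


def simulate_normal_circle(theta, rho=1.0, n=5, reps=20000, seed=0):
    """Return a list of (r, squared angular error) pairs.

    Each replication draws n bivariate normal points with mean
    (rho*cos(theta), rho*sin(theta)) and identity covariance, sums them,
    and converts the sum to polar coordinates (theta_hat, r).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(reps):
        sx = sum(rng.gauss(rho * math.cos(theta), 1.0) for _ in range(n))
        sy = sum(rng.gauss(rho * math.sin(theta), 1.0) for _ in range(n))
        r = math.hypot(sx, sy)
        theta_hat = math.atan2(sy, sx)
        # Wrap the angular error into (-pi, pi] before squaring.
        err = math.atan2(math.sin(theta_hat - theta), math.cos(theta_hat - theta))
        pairs.append((r, err * err))
    return pairs


def conditional_mse_by_r(pairs, n_bins=4):
    """Average squared error within equal-sized groups ordered by r."""
    ordered = sorted(pairs)
    size = len(ordered) // n_bins
    return [
        sum(e for _, e in ordered[i * size:(i + 1) * size]) / size
        for i in range(n_bins)
    ]


if __name__ == "__main__":
    mse = conditional_mse_by_r(simulate_normal_circle(theta=0.7))
    print("mean squared error by quartile of r (small r first):", mse)
```

Grouping by quartiles of $r$ is a crude stand-in for exact conditioning on $r$; it is used here only because it needs no density calculations.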

Example 2. Normal mean with known coefficient of variation.
Hinkley (1977). Let $X_1, \ldots, X_n$ be independently $N(\mu, b^2\mu^2)$ where $b$ is
a known constant and $\mu > 0$; then $(\sum X_i, \sum X_i^2)$ is sufficient and
$C = (\sum X_i^2)^{1/2}/\sum X_i$ is ancillary. Hinkley (1977) shows that the
unconditional and conditional Fisher information about $\theta = \log\mu$ are
both free of $\theta$ and that the conditional information is a decreasing
function of $c$.


Examples 1 and 2 suggest that even though an ancillary statistic
by itself carries no information about $\theta$, it is of value when used in
conjunction with some other statistic. To be more precise, suppose that
the minimal sufficient statistic can be written as $S = (T,C)$ where $C$ is
the ancillary part and $T$ is the so-called information carrying part.
Although $C$ contains no information about $\theta$, it constitutes part of the
minimal sufficient statistic. A good way to utilise $C$ is to carry out
conditional inference, and so the role of $C$ as a precision index becomes
relevant. The expression of $S$ in the form $(T,C)$ can happen only when a
model is not complete because, by a theorem of Basu (1955, 1958), if $S$ is
boundedly complete, then $S$ cannot contain any ancillary component.
Related  to  this  discussion  is  a  result  of  Lehmann  (1981)  which  can  be 
summarized  roughly  by  saying  that  the  various  forms  of  completeness  of 
a  sufficient  statistic  S  characterize  the  success  of  S  in  separating  the 
informative  part  of  the  data  from  that  part  which  by  itself  carries 
little  or  no  information. 

In  other  examples,  it  is  obvious  that  some  data  values  are  more 
informative  than  others,  as  in  the  example  of  random  sample  size  (Efron, 
1978).  Another  example  is  the  following. 

Example 3. Location parameter of a uniform distribution. Let
$X_1, \ldots, X_n$ be independently $U(\theta - \tfrac12, \theta + \tfrac12)$; then the minimal sufficient
statistic is the pair of extreme order statistics $(X_{(1)}, X_{(n)})$ and an
ancillary statistic is the sample range $C = X_{(n)} - X_{(1)}$. If the observed
$c$ is close to 1, we can almost pinpoint $\theta$ whereas if $c$ is close to
zero, the sample is relatively uninformative.
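The role of the range in Example 3 can be made concrete with a short computation. The sketch below is illustrative only; the function names, the sample size and the value of $\theta$ are assumptions chosen for the demonstration. Given a sample from $U(\theta - \tfrac12, \theta + \tfrac12)$, the values of $\theta$ compatible with the data form the interval $[X_{(n)} - \tfrac12, X_{(1)} + \tfrac12]$, whose width is $1 - c$.

```python
import random


def sample_range(xs):
    """Ancillary statistic C = X_(n) - X_(1)."""
    return max(xs) - min(xs)


def theta_interval(xs):
    """Interval of theta values consistent with a U(theta - 1/2, theta + 1/2) sample."""
    return max(xs) - 0.5, min(xs) + 0.5


if __name__ == "__main__":
    rng = random.Random(1)
    theta = 3.0  # hypothetical true value, used only to generate data
    xs = [theta + rng.random() - 0.5 for _ in range(10)]
    lo, hi = theta_interval(xs)
    c = sample_range(xs)
    print(f"range c = {c:.3f}; theta lies in [{lo:.3f}, {hi:.3f}], width {hi - lo:.3f}")
```

A range close to 1 leaves an interval of width close to zero, while a range close to zero leaves an interval of width close to 1, which is the sense in which $c$ indexes the informativeness of the sample.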

Fisher (1935, p.48) wrote: "Ancillary statistics are only useful
when different samples of the same size can supply different amounts of
information, and serve to distinguish those which supply more from those
which supply less." Thus a useful ancillary statistic divides the sample
space into equally informative subsets. Since some samples are more
informative than others, we should not average over the whole sample
space to obtain an inference, but only over those samples that contain
the same amount of information. This leads again to the conditionality
principle. It is, however, very hard to make precise the above idea, as
remarked by Efron (1978): "So far, it has proved impossible to codify
this statement in a satisfactory way." In the next section, we will
give one interpretation of this statement within the context of
transformation models.

Fisher's  statement  also  leads  to  the  idea  that  the  better  an 
ancillary  statistic  is  in  distinguishing  the  more  informative  samples 
from  those  that  are  less  so,  the  more  useful  it  is  as  a  conditioning 
variable. An implementation of this criterion for choosing an ancillary
has been provided by Cox (1971), who compares different ancillaries in
terms of $\mathrm{var}\{I_\theta(C)\}$, the variance of the conditional Fisher information.
Properties of this criterion are studied by Becker & Gordon (1983). In
§4, we shall see that for both transformation and exponential models,
$\mathrm{var}\{I_\theta(C)\}$ is proportional to $\gamma_\theta^2$, the square of the statistical curvature.


3.  TRANSFORMATION  MODELS 

3.1 Ancillarity of maximal invariant statistic

Suppose  that  we  have  a  problem  invariant  under  a  group  of 
transformations  G,  and  that  preliminary  reduction  by  sufficiency  has 
already  taken  place  so  that  G  acts  on  the  space  of  sufficient  statistics 
$S$ with maximal invariant statistic $C(S)$. Let $G^*$ be the group of induced
transformations on the parameter space and $\nu(\theta)$ the corresponding
maximal invariant function. If $\nu(\theta)$ is a constant function, then $C(S)$
is ancillary. Mathematically, if $\nu(\theta)$ is a constant function, the model
is equivalent to a structural model (Fraser 1968) even though the emphasis
is quite different. Barndorff-Nielsen (1980) calls such models
transformation models.


3.2  Examples 

Most of the examples of ancillary statistics actually fall under
this category. Surprisingly, we have not been able to find any mention
of this fact in the literature; instead, we find examples scattered
around and treated separately. It seems worthwhile, therefore, to
look at some of those examples from our present point of view, starting
with the list given by Buehler (1982). In addition to our Examples 1,
2 and 3, we find the following.

Example 4. Sprott (1961). Let $X_1, \ldots, X_n$ be independently
[...] and $Y_1, \ldots, Y_m$ independently $\mathrm{Exp}(ae^{k\theta})$, where $a$ and $k$ are
known constants. The X's and Y's are independent. This problem remains
invariant under transformations $X_i' = X_i + b$, $Y_j' = e^{bk}Y_j$, or
equivalently $\log Y_j' = \log Y_j + bk$ $(i = 1, \ldots, n;\ j = 1, \ldots, m)$.
The minimal sufficient statistic is $(S_1, S_2) = (\sum X_i, \sum Y_j)$,
the maximal invariant statistic is $C(S_1, S_2) = S_1/n - (\log S_2)/k$ and $\nu(\theta)$
is a constant function. Hence $C$ is ancillary.

Example 5. Fisher's gamma hyperbola. Let $X \sim \mathrm{Exp}(\theta)$, $Y \sim \mathrm{Exp}(1/\theta)$
independent of $X$. Observe $n$ pairs $(X_1,Y_1), \ldots, (X_n,Y_n)$. This problem
remains invariant under transformations $X_i' = bX_i$, $Y_i' = Y_i/b$, $b > 0$
$(i = 1, \ldots, n)$. The sufficient statistic is $(S_1, S_2) = (\sum X_i, \sum Y_i)$, the
maximal invariant statistic is $C(S_1, S_2) = S_1S_2$ and $\nu(\theta)$ is a constant
function. Hence $C$ is ancillary.
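The invariance argument of Example 5 can be illustrated numerically. In the sketch below, which is illustrative rather than part of the paper, the function name is hypothetical and $\mathrm{Exp}(\theta)$ is assumed to mean an exponential distribution with mean $\theta$ (the conclusion is the same under the rate parametrization). The data are built from the same unit exponential draws for every $\theta$, so that $C = S_1S_2$ comes out the same whatever value of $\theta$ is used.

```python
import math
import random


def ancillary_c(theta, n=20, seed=0):
    """Compute C = (sum X_i)(sum Y_i) for the gamma hyperbola.

    X_i = theta * E_i and Y_i = F_i / theta, where E_i and F_i are unit
    exponential draws determined by the seed (mean parametrization assumed).
    """
    rng = random.Random(seed)
    e = [rng.expovariate(1.0) for _ in range(n)]
    f = [rng.expovariate(1.0) for _ in range(n)]
    xs = [theta * ei for ei in e]
    ys = [fi / theta for fi in f]
    return sum(xs) * sum(ys)


if __name__ == "__main__":
    c_small, c_large = ancillary_c(0.3), ancillary_c(7.0)
    print(c_small, c_large, math.isclose(c_small, c_large, rel_tol=1e-9))
```

Because the scale factors $\theta$ and $1/\theta$ cancel in the product, the distribution of $C$ is that of the product of two independent unit-scale gamma sums and does not involve $\theta$.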

The  following  example  seems  to  be  new. 

Example 6. A special case of a correlated bivariate gamma
density given by Barndorff-Nielsen (1980) is

$$f(x,y;\alpha,\theta) = (\alpha\theta - 1)\, I_0(2\sqrt{xy}) \exp(-\alpha x - \theta y) \quad (x > 0,\ y > 0). \qquad (3.1)$$

The parameter space is $\alpha > 0$, $\theta > 0$, $\alpha\theta > 1$ and $I_0(2\sqrt{u})$ is the Bessel
function $\sum_j u^j/(j!)^2$. If $\alpha$, $\theta$ are restricted by $\alpha\theta = a$ where $a > 1$, the
bivariate density becomes

$$f(x,y;\theta) = (a - 1)\, I_0(2\sqrt{xy}) \exp(-ax/\theta - \theta y) \quad (x > 0,\ y > 0). \qquad (3.2)$$

If we have $n$ independent pairs of bivariate observations $(X_1,Y_1), \ldots,
(X_n,Y_n)$ from density (3.2), then the problem remains invariant under
transformation $Y_i' = bY_i$, $X_i' = X_i/b$, $b > 0$ $(i = 1, \ldots, n)$. The sufficient
statistic is $(S_1, S_2) = (\sum X_i, \sum Y_i)$, the maximal invariant statistic is
$C(S_1, S_2) = S_1S_2$ and $\nu(\theta)$ is a constant function. So $C$ is ancillary.
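As a check on the form of density (3.1), the normalizing constant can be verified term by term from the series for the Bessel function. The derivation below is a sketch supplied for illustration; it assumes the series $I_0(2\sqrt{u}) = \sum_j u^j/(j!)^2$ and the parameter names $\alpha$, $\theta$ used above.

```latex
\begin{align*}
\int_0^\infty\!\!\int_0^\infty I_0(2\sqrt{xy})\, e^{-\alpha x - \theta y}\, dx\, dy
  &= \sum_{j=0}^{\infty} \frac{1}{(j!)^2}
     \int_0^\infty x^j e^{-\alpha x}\, dx \int_0^\infty y^j e^{-\theta y}\, dy \\
  &= \sum_{j=0}^{\infty} \frac{1}{(j!)^2}\,
     \frac{j!}{\alpha^{j+1}}\, \frac{j!}{\theta^{j+1}}
   = \sum_{j=0}^{\infty} (\alpha\theta)^{-(j+1)}
   = \frac{1}{\alpha\theta - 1} \qquad (\alpha\theta > 1).
\end{align*}
```

The geometric series converges exactly when $\alpha\theta > 1$, which matches the stated parameter space, and multiplying by $(\alpha\theta - 1)$ gives total mass one.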

The  mathematical  equivalence  of  transformation  models  and  structural 
models  can  also  be  used  to  construct  new  examples  of  ancillary  statistics. 
There is, however, a slight complication since the structural approach
is not concerned with sufficiency reduction. For example, if $X_1, \ldots, X_n$
are independently $N(\mu, 1)$ and we do not reduce by sufficiency,
$(X_{(2)} - X_{(1)}, \ldots, X_{(n)} - X_{(1)})$ is ancillary, but if we reduce $X_1, \ldots, X_n$
to the sufficient statistic $S = \sum X_i$, an ancillary statistic
no longer exists. It follows from our discussion in §2 that we should
look for a structural model which is not complete.


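Example 7. Progression model. The progression model is a
multivariate structural model introduced by Fraser (1968, p.139). We
consider the simplest case with dimension two, the group $G$ indexed by
only one parameter and the error distribution normal. The model is
as follows:

$$\begin{pmatrix} X_{i1} \\ X_{i2} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \tau & 1 \end{pmatrix} \begin{pmatrix} e_{i1} \\ e_{i2} \end{pmatrix} \quad (i = 1, \ldots, n),$$

where $e_1, \ldots, e_n$ are independently $N_2(0, I_2)$.

The model is also equivalent to the following:

$$\begin{pmatrix} X_{i1} \\ X_{i2} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \tau \\ \tau & 1 + \tau^2 \end{pmatrix} \right) \quad (i = 1, \ldots, n),$$

where $X_i$ $(i = 1, \ldots, n)$ are independent. The sufficient statistic is
then $S = (\sum X_{i1}^2, \sum X_{i1}X_{i2})$ and $\sum X_{i1}^2$ is obviously ancillary. One implication
is that $S$ is not complete, which is in fact the case.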
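The sufficiency and ancillarity claims in Example 7 can be read off the log likelihood. The following short derivation is a sketch added for illustration; it assumes the two-dimensional model with parameter $\tau$ written above, in which $e_{i1} = X_{i1}$ and $e_{i2} = X_{i2} - \tau X_{i1}$ with unit Jacobian.

```latex
\begin{align*}
\ell(\tau) &= -n\log(2\pi) - \tfrac12 \sum_{i=1}^n X_{i1}^2
              - \tfrac12 \sum_{i=1}^n (X_{i2} - \tau X_{i1})^2 \\
           &= -n\log(2\pi) - \tfrac12 \sum_{i=1}^n X_{i1}^2 - \tfrac12 \sum_{i=1}^n X_{i2}^2
              + \tau \sum_{i=1}^n X_{i1}X_{i2} - \tfrac12 \tau^2 \sum_{i=1}^n X_{i1}^2 .
\end{align*}
```

The parameter enters only through $\sum X_{i1}X_{i2}$ and $\sum X_{i1}^2$, and since $X_{i1} = e_{i1}$ the second of these is a $\chi^2_n$ variable whatever the value of $\tau$.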

3.3  Equivariant  estimation 

We have seen that if we have a problem that remains invariant
under $G$ and such that $\nu(\theta)$ is a constant function, then the maximal
invariant statistic $C(S)$ is ancillary. If we have an equivariant
estimate $\hat\theta$ of $\theta$ and a loss function $L(\theta, \hat\theta)$ that remains invariant
under $G^*$, then, by a standard result, the risk $E_\theta\{L(\theta, \hat\theta)\}$ does not
depend on $\theta$. By a similar argument, we can show that the conditional
risk $E_\theta\{L(\theta, \hat\theta) \mid C\}$ also does not depend on $\theta$. Denote these by $R$ and
$R(C)$. Then $R = E\{R(C)\}$ and the fact that $\theta$ is not involved simplifies
interpretation. We have seen an application of this result in Example 1.

3.4  An  interpretation 

Within  the  context  of  transformation  models,  we  can  also  give 
an  interpretation  to  the  statement  that  two  samples  with  the  same 
ancillary  statistic  value  contain  equal  amounts  of  information.  In 
doing  so,  we  make  use  of  fiducial  distributions  which  are  distributions 
on  the  parameter  space.  Buehler  (1982)  suggests  using  a  more  neutral 
term  "induced  distribution".  It  seems  inevitable  that  a  fiducial  or  a 
similar  kind  of  argument  is  needed,  because  of  the  frequentist  definition 
of  information  as  an  average  over  the  sample  space  or,  in  the  case  of 
conditional  information,  part  of  the  sample  space.  Thus  it  is  difficult 
to  talk  about  information  contained  in  a  sample  because  we  cannot  condition 
on  the  observed  sample.  As  remarked  by  Efron  (1978),  "This  is  impossible 
in  the  frequentist  framework,  since  if  we  reduce  our  averaging  set  to 
one  data  point,  there  is  nothing  left  to  average  over."  This  is  where 
fiducial  theory  may  help,  because  the  method  of  fiducial  probability 
aims  to  get  probability  statements  about  parameters  without  the  use  of 
Bayes'  formula;  probabilities  concerning  pivotal  quantities  are  inverted 
into  formal  statements  about  parameters.  While  this  approach  creates 
much  controversy,  meaningful  results  are  obtainable  when  applied  to 
transformation  models.  The  interested  reader  is  referred  to  Zacks 
(1971,  §§7.3-4). 

The  kind  of  interpretation  that  we  are  aiming  at  can  be  illustrated 
by Example 3. Clearly $\hat\theta = (X_{(1)} + X_{(n)})/2$ is an equivariant estimate of
$\theta$ with respect to the translation group and $S = (\hat\theta, C)$ is minimal
sufficient. The statement $\hat\theta \mid C = c \sim \theta + U(-(1-c)/2, (1-c)/2)$ can be
inverted formally to $\theta \sim \hat\theta + U(-(1-c)/2, (1-c)/2)$, the induced
distribution of $\theta$. Thus if we have two samples $(\hat\theta, c)$ and $(\hat\theta + a, c)$,
their corresponding induced distributions are simply translations of
one another. As a result, whatever amount of support the sample $(\hat\theta, c)$
gives to any particular value of $\theta$, say $\theta_0$, the sample $(\hat\theta + a, c)$ will
give the same amount of support to $\theta_0 + a$.

The general case is similar. We assume that the minimal sufficient
statistic is $S = (\hat\theta, C)$, where $\hat\theta$ is an equivariant estimate and $C$ is the
ancillary statistic which in this case is also the maximal invariant statistic.
If we have two samples $s_1 = (\hat\theta_1, c)$, $s_2 = (\hat\theta_2, c)$ with the same ancillary
statistic value, $\hat\theta_2$ and $\hat\theta_1$ are necessarily related by $\hat\theta_2 = g^*\hat\theta_1$ for
some $g^* \in G^*$. Let $f(\theta_1 \mid s_1)$, $f(\theta_2 \mid s_2)$ be the induced densities; then
they are similarly related by $g^*$. To be precise, if $\theta_1$ has density
$f(\theta_1 \mid s_1)$ and $\theta_2 = g^*\theta_1$, then $\theta_2$ has density $f(\theta_2 \mid s_2)$.

Most  authors  consider  the  maximum  likelihood  estimator  as  the 
information  carrying  part  of  the  minimal  sufficient  statistic.  In  our 
discussion,  any  equivariant  estimator  can  be  used  as  the  information 
carrier.  Since  the  maximum  likelihood  estimator  is  equivariant,  our 
treatment is more general.


4. PRECISION INDEX AND STATISTICAL CURVATURE
4.1  Exact  precision  index 

In discussing the role of an ancillary statistic as a precision
index, Buehler (1982) calls $C$ an exact precision index if for some
parametrization

$$f(\hat\theta \mid c; \theta) = f_0(\hat\theta - \theta \mid c), \qquad (4.1)$$

where $\hat\theta$ is the maximum likelihood estimate. In other words, $C$ is an
exact precision index if conditionally, $\theta$ is a location parameter. An
implication of this definition is that the unconditional and conditional
Fisher information both do not depend on $\theta$. This again simplifies
interpretation. Example 2 is an illustration.

4.2  Transformation  models 

As remarked by Barndorff-Nielsen (1980), for transformation
models, it is no essential restriction to assume that $\theta$ is conditionally
a location parameter. It follows that, for transformation models, the
ancillary statistic $C$ is an exact precision index, and the statistical
curvature $\gamma_\theta$ is constant. In passing, we mention that Amari (1982a)
defines a one-parameter family of affine connexions with associated
curvature. Efron's (1975) definition corresponds to the exponential
connexion and is called by Amari the exponential curvature.

Let $\ell_\theta(X)$ be the log likelihood function of $X = (X_1, \ldots, X_n)$ and
denote its first and second derivatives with respect to $\theta$ by $\dot\ell_\theta(X)$ and
$\ddot\ell_\theta(X)$. The total Fisher information is $I_\theta = E_\theta\{-\ddot\ell_\theta(X)\}$ whereas the
conditional Fisher information $I_\theta(C)$ is defined as the Fisher information
of the conditional distribution of $X$ given $C$. Cox (1971) proposes the
use of $\mathrm{var}\{I_\theta(C)\}$ as a measure of the extent to which the ancillary
statistic $C$ divides the possible data points into relatively informative
ones and relatively uninformative ones. In our present case, $I_\theta$,
$I_\theta(C)$, and $\gamma_\theta$ are all free of $\theta$ and we denote them by $I$, $I(C)$ and $\gamma$
respectively. It can be shown that $\sqrt{n}\{I(C)/I - 1\} \to N(0, \gamma^2)$. Without
loss of generality we may assume $I = n$, so that under regularity
conditions, $\mathrm{var}\{I(C)\}/n \to \gamma^2$ $(n \to \infty)$. Thus the statistical curvature
is a measure of the effect of conditioning, in that the larger $\gamma^2$ is,
the larger is the difference between conditional and unconditional
inference. Our results can be summarized as follows.

THEOREM 1. For transformation models,

(i) the statistical curvature $\gamma$ does not depend on $\theta$,

(ii) the ancillary statistic $C$ is an exact precision index,

(iii) under regularity conditions

$$\sqrt{n}\left\{\frac{I(C)}{I} - 1\right\} \to N(0, \gamma^2),$$

(iv) under regularity conditions

$$\lim_{n \to \infty} \frac{1}{n}\,\mathrm{var}\{I(C)\} = \gamma^2.$$

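Proof. (iii) Without loss of generality, $I = n$. Efron & Hinkley (1978)
prove that

$$\sqrt{n}\left\{\frac{-\ddot\ell_{\hat\theta}(X)}{n} - 1\right\} \to N(0, \gamma^2),$$

where $\hat\theta$ is the maximum likelihood estimate of $\theta$ and $-\ddot\ell_{\hat\theta}(X)$ is the
so-called observed information. To obtain our desired result, we need
to show that

$$\frac{-\ddot\ell_{\hat\theta}(X) - E_\theta\{-\ddot\ell_\theta(X) \mid C\}}{\sqrt{n}} \to 0 \quad \text{in probability.}$$

To show this, we shall make use of the following two facts:

(a) $E_\theta\{-\ddot\ell_{\hat\theta}(X) \mid C\} = -\ddot\ell_{\hat\theta}(X)$,

(b) $E_\theta\{n(\hat\theta - \theta)^2 \mid C\} \to 0$ in probability,

the latter being proved by Efron and Hinkley (1978). By a Taylor's
series expansion, we have $\ddot\ell_{\hat\theta}(X) = \ddot\ell_\theta(X) + (\hat\theta - \theta)\dddot\ell_{\theta_1}(X)$ where
$|\theta_1 - \theta| < |\hat\theta - \theta|$. Assume that $|\dddot\ell_{\theta_1}(X_j)| < M(X_j)$ for $\theta_1$ in a
neighbourhood of $\theta$, and with $E_\theta\{M(X_j)\} = \mu < \infty$. Then, with probability 1,

$$\frac{\left|-\ddot\ell_{\hat\theta}(X) - E_\theta\{-\ddot\ell_\theta(X) \mid C\}\right|}{\sqrt{n}}
\le E_\theta\left\{\sqrt{n}\,|\hat\theta - \theta|\,\frac{\sum M(X_j)}{n} \,\Big|\, C\right\}$$

$$= \mu E_\theta\{\sqrt{n}\,|\hat\theta - \theta| \mid C\} + E_\theta\left\{\sqrt{n}\,|\hat\theta - \theta|\left(\frac{\sum M(X_j)}{n} - \mu\right) \Big|\, C\right\}.$$

The proof is completed by applying the Cauchy-Schwarz inequality to
the last two expectations and then using (b).

(iv) The conditions are those that guarantee the equality of
the asymptotic variance with the limit of the variances. A set of
such conditions can be found in Zacks (1971, p.244). In our case,
the conditions can be simplified somewhat because $E\{I(C)\} = I$ so
that the bias is zero.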


4.3  Exponential  models 

Another general class of models for which exact ancillary statistics
exist is the exponential class with a cut (Barndorff-Nielsen, 1978, 1980).

Consider the full exponential model of order $k$

$$p(x;\lambda) = \exp\{\lambda\cdot t(x) - \kappa(\lambda) - h(x)\}.$$

Here $\lambda$ and $t$ are both vectors of dimension $k$ and $\lambda\cdot t$ denotes their dot
product. If $\lambda = \lambda(\theta)$ where $\theta$ is $d$-dimensional, then we have a $d$-dimensional
curved subfamily of the full model. If we let $\tau(\lambda) = E_\lambda(t)$ and
partition both $t$ and $\tau$ into two components, say $t = (u,v)$, $\tau = (\xi,\eta)$,
such that $u$ and $\xi = E(u)$ are $(k - d)$-dimensional while $v$ and $\eta = E(v)$
are $d$-dimensional, then the constraint $\xi = \xi_0$ also defines a $d$-dimensional
curved subfamily. In this case, we can take $\theta$ to be the second component
of the similar partition of $\lambda$, that is, $\lambda = (\chi,\theta)$ with $\chi$ some function
of $\theta$. If $\chi$ is of the form $\chi = a(\xi) + B(\theta)$ for some functions $a$ and $B$,
then $u$ is a cut and then $C = u(X)$ is exactly ancillary. A number of
examples of cuts are given by Barndorff-Nielsen (1978, §10.2). If we
have $n$ independent observations $X_1, \ldots, X_n$, then $\sum t(X_i)$ is sufficient
and $C = \sum u(X_i)$ is ancillary.

The following example, due to Barndorff-Nielsen (1980) and used
by Buehler (1982), has an ancillary statistic which is not an exact
precision index but is nevertheless an approximate precision index.

Example 8. Let $(X_1,Y_1), \ldots, (X_n,Y_n)$ be independent observations
from the correlated bivariate gamma density (3.1). If $\alpha$, $\theta$ are restricted
by $\alpha - \theta^{-1} = a$ for any $a > 0$, the bivariate density becomes

$$f(x,y;\theta) = a\theta\, I_0(2\sqrt{xy}) \exp\{-ax - (x/\theta) - \theta y\}$$

and the marginal distribution of $x$ is $\mathrm{Exp}(a)$. The minimal sufficient
statistic is $S = (\sum X_i, \sum Y_i)$ and $C = \sum X_i \sim \mathrm{Gamma}(n,a)$ is ancillary.
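The claim that the marginal distribution of $x$ in Example 8 is $\mathrm{Exp}(a)$, and hence free of $\theta$, can be confirmed by integrating out $y$. The steps below are a sketch added for illustration; they assume the density $f(x,y;\theta) = a\theta\, I_0(2\sqrt{xy})\exp\{-ax - x/\theta - \theta y\}$ and the series $I_0(2\sqrt{u}) = \sum_j u^j/(j!)^2$.

```latex
\begin{align*}
\int_0^\infty f(x,y;\theta)\, dy
  &= a\theta\, e^{-ax - x/\theta} \sum_{j=0}^{\infty} \frac{x^j}{(j!)^2}
     \int_0^\infty y^j e^{-\theta y}\, dy \\
  &= a\theta\, e^{-ax - x/\theta} \sum_{j=0}^{\infty} \frac{x^j}{(j!)^2}\,
     \frac{j!}{\theta^{j+1}}
   = a\, e^{-ax - x/\theta} \sum_{j=0}^{\infty} \frac{(x/\theta)^j}{j!}
   = a\, e^{-ax} .
\end{align*}
```

The factor $e^{x/\theta}$ produced by the series cancels the $e^{-x/\theta}$ in front, which is why $\theta$ disappears from the marginal density.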

Buehler (1982) investigates the case $n = 1$ and finds that the precision
of estimation depends not only on $C$ but also on $\theta$ and that this property
remains true no matter how $\theta$ is transformed. Since there is no
parametrization such that the conditional Fisher information is free
of the parameter, $C$ is not an exact precision index. However, numerical
calculations indicate that $C$ is an approximate precision index.


The next theorem, which should be compared with Theorem 1, shows
that this phenomenon holds true in general.

THEOREM 2. For exponential families with a cut,

(i) the statistical curvature $\gamma_\theta$ is not constant,

(ii) $\mathrm{var}\{I_\theta(C)\} = n\gamma_\theta^2 i_\theta^2$ where $i_\theta = E_\theta\{-\ddot\ell_\theta(X_1)\}$,

(iii) the ancillary statistic $C = \sum u(X_i)$ is not an exact precision
index,

(iv) $(1/n)I_\phi(C) = 1 + \epsilon_n$, $\epsilon_n \to 0$ a.s., where $\phi(\theta)$ is the variance
stabilizing transformation for $i_\theta$.

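Proof. (i) Following Efron (1975), let

$$M_\theta = \begin{pmatrix} \nu_{20} & \nu_{11} \\ \nu_{11} & \nu_{02} \end{pmatrix}, \quad
\nu_{20} = E_\theta\dot\ell_\theta^2 = i_\theta, \quad \nu_{11} = E_\theta\dot\ell_\theta\ddot\ell_\theta, \quad \nu_{02} = E_\theta\ddot\ell_\theta^2 - i_\theta^2,$$

where $\dot\ell_\theta$, $\ddot\ell_\theta$ denote the first and second derivatives of $\ell_\theta(X_1)$; then
$\gamma_\theta^2 = |M_\theta|/i_\theta^3$. In our case

$$p(x;\theta) = \exp\{\chi(\theta)\cdot u + \theta\cdot v - \kappa(\lambda(\theta)) - h(x)\},$$

where $\lambda(\theta) = (\chi(\theta), \theta)$, so that

$$\dot\ell_\theta = \chi'(\theta)\cdot(u - \xi_0) + (v - \eta),$$

$$\ddot\ell_\theta = \chi''(\theta)\cdot(u - \xi_0) - \partial\eta/\partial\theta.$$

From Barndorff-Nielsen's (1980) result that $\dot\ell_\theta(X_1)$ is uncorrelated
with $u(X_1)$, we have

$$E_\theta\dot\ell_\theta\ddot\ell_\theta = \mathrm{cov}(\dot\ell_\theta, \ddot\ell_\theta) = 0$$

and

$$\gamma_\theta^2 = \{\nu_{20}\nu_{02} - \nu_{11}^2\}/i_\theta^3 = \mathrm{var}(\ddot\ell_\theta)/i_\theta^2,$$

since $i_\theta = -E_\theta\ddot\ell_\theta$. Thus $\gamma_\theta^2$ is not constant.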

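(ii) $\ddot\ell_\theta(X) = \sum\ddot\ell_\theta(X_i) = \chi''(\theta)\cdot(\sum u(X_i) - n\xi_0) - n\,\partial\eta/\partial\theta$ is a
function of $C = \sum u(X_i)$ and so

$$I_\theta(C) = E_\theta\{-\ddot\ell_\theta(X) \mid C\} = -\ddot\ell_\theta(X)$$

and

$$\mathrm{var}\{I_\theta(C)\} = \mathrm{var}\{-\ddot\ell_\theta(X)\} = n\,\mathrm{var}\{-\ddot\ell_\theta(X_1)\} = n\gamma_\theta^2 i_\theta^2.$$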


(iii) From (ii), $\gamma_\theta^2 = \mathrm{var}\{I_\theta(C)\}/(n i_\theta^2)$. Since the statistical
curvature is invariant under reparametrization, there exists no
parametrization such that $I_\theta$ and $I_\theta(C)$ are free of $\theta$. In other words
$C$ is not an exact precision index.


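(iv) Let $\phi(\theta)$ be the variance stabilizing transformation for
$i_\theta$, so that $|\phi'(\theta)| = i_\theta^{1/2}$. Since $I_\theta(C) = -\ddot\ell_\theta(X) = -\sum\ddot\ell_\theta(X_i)$,

$$I_\phi(C) = I_\theta(C)/i_\theta$$

and

$$\frac{1}{n}I_\phi(C) = \frac{1}{n}\sum\{-\ddot\ell_\theta(X_i)\}/i_\theta = 1 + \epsilon_n, \quad \text{where } \epsilon_n \to 0 \text{ a.s.}$$

In this sense, $C$ can be regarded as an approximate precision index.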

4.4 Discussion


In his discussion of Efron (1975), Cox (1975) comments that
"Existence of an approximate ancillary must be connected with the
approximate constancy of $\gamma_\theta$ as a function of $\theta$; it would be good to
have the connexions explored." Cox's comment seems to suggest that
the existence of an ancillary statistic is connected with the constancy
of statistical curvature. This cannot be correct since an exact
ancillary statistic exists for exponential families with a cut. Our
results seem to indicate that the constancy of statistical curvature
is related to the concept of exact precision index. Perhaps the clearest
indication of this is in the proof of part (iii) of Theorem 2 where the
non-constancy of $\gamma_\theta$ leads us to conclude that $C$ cannot be an exact
precision index.


5.  MISCELLANEOUS  REMARKS 

As in most papers on this topic, we deal with scalar $\theta$; this is
for the sake of simplicity and also because we believe that it is
often the simple examples which shed light on a concept. In principle,
we can also deal with the multiparameter case. An example of a
multiparameter transformation model is the hyperboloid distribution with
fixed value of the "concentration parameter". That this is the case
is not obvious; Barndorff-Nielsen (1980) attributes the proof to
J.L. Jensen. On the other hand, an example of a two-parameter
exponential family for which an exact ancillary exists is due to
G.W. Cobb, cited by Hinkley (1980). The generalization of the concept
of exact precision index is not as obvious, because covariance stabilizing
transformations do not exist in general (Holland, 1973).

We also do not deal with nuisance parameters. For problems
invariant under a group of transformations $G$, if $\nu(\theta)$ is not a constant
function, then $\nu(\theta)$ becomes a nuisance parameter. Examples are model
II of analysis of variance and the problem of estimating the common
mean of two normal distributions $N(\mu, \sigma_1^2)$, $N(\mu, \sigma_2^2)$ based on two samples
of equal size. For a treatment of equivariant estimation in the
presence of a nuisance parameter, the reader is referred to Zacks (1971,
§7.7) where Bayes equivariant and fiducial estimators are derived for
the examples just mentioned.

The author is most grateful to Professor J. Aitchison for his
helpful suggestions.